Xiaoye Qu

May

Not All Inputs Are Valid: Towards Open-Set Video Moment Retrieval Using Language

May 28, 2026
Via arXiv

VPG: Visual Prefix Guidance for Autoregressive Image and Video Generation

May 28, 2026
Via arXiv

Draft-OPD: On-Policy Distillation for Speculative Draft Models

May 28, 2026
Via arXiv

Rethinking Video-Language Model from the Language Input Perspective

May 27, 2026
Via arXiv

Rethinking Weakly-supervised Video Temporal Grounding From a Game Perspective

May 26, 2026
Via arXiv

$π$-Bench: Evaluating Proactive Personal Assistant Agents in Long-Horizon Workflows

May 14, 2026
Via arXiv

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

May 13, 2026
Via arXiv

VCE: A zero-cost hallucination mitigation method of LVLMs via visual contrastive editing

Apr 21, 2026
Via arXiv

GEMS: Agent-Native Multimodal Generation with Memory and Skills

Mar 30, 2026
Via arXiv

ExFusion: Efficient Transformer Training via Multi-Experts Fusion

Mar 30, 2026
Via arXiv